Contrastive Learning


Hierarchical Contrastive Learning for Multimodal Data

Li, Huichao, Yu, Junhan, Zhou, Doudou

arXiv.org Machine Learning

Multimodal representation learning is commonly built on a shared-private decomposition, treating latent information as either common to all modalities or specific to one. This binary view is often inadequate: many factors are shared by only subsets of modalities, and ignoring such partial sharing can over-align unrelated signals and obscure complementary information. We propose Hierarchical Contrastive Learning (HCL), a framework that learns globally shared, partially shared, and modality-specific representations within a unified model. HCL combines a hierarchical latent-variable formulation with structural sparsity and a structure-aware contrastive objective that aligns only modalities that genuinely share a latent factor. Under uncorrelated latent variables, we prove identifiability of the hierarchical decomposition, establish recovery guarantees for the loading matrices, and derive parameter estimation and excess-risk bounds for downstream prediction. Simulations show accurate recovery of hierarchical structure and effective selection of task-relevant components. On multimodal electronic health records, HCL yields more informative representations and consistently improves predictive performance.
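
As a rough illustration of the structure-aware contrastive objective described above, the sketch below (not the authors' code; the modality names, the hand-supplied sharing mask, and the InfoNCE form are assumptions for illustration) aligns only those modality pairs that share a latent factor. In HCL itself the sharing pattern is learned through structural sparsity rather than given in advance.

```python
import torch
import torch.nn.functional as F

def structure_aware_infonce(z, share_mask, temperature=0.1):
    """Align only modality pairs that genuinely share a latent factor.

    z          : dict mapping modality name -> (batch, dim) embeddings
                 for that modality's latent block.
    share_mask : dict mapping (mod_a, mod_b) -> bool; keys follow the
                 iteration order below (a before b). True means the two
                 modalities share at least one latent factor.
    """
    loss, n_pairs = 0.0, 0
    mods = list(z.keys())
    for i, a in enumerate(mods):
        for b in mods[i + 1:]:
            if not share_mask.get((a, b), False):
                continue  # no shared factor: do not force alignment
            za = F.normalize(z[a], dim=-1)
            zb = F.normalize(z[b], dim=-1)
            logits = za @ zb.t() / temperature  # (batch, batch) similarities
            targets = torch.arange(za.size(0), device=za.device)
            # symmetric InfoNCE: sample i's positive is its own pair in the
            # other modality; all other samples in the batch are negatives
            loss = loss + 0.5 * (F.cross_entropy(logits, targets)
                                 + F.cross_entropy(logits.t(), targets))
            n_pairs += 1
    return loss / max(n_pairs, 1)
```

Skipping pairs with no shared factor is what distinguishes this from a standard all-pairs multimodal InfoNCE, which would over-align unrelated signals.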


Contrastive Conformal Sets

Alkhatib, Yahya, Tay, Wee Peng

arXiv.org Machine Learning

Contrastive learning produces coherent semantic feature embeddings by encouraging positive samples to cluster closely while separating negative samples. However, existing contrastive learning methods lack principled guarantees on coverage within the semantic feature space. We extend conformal prediction to this setting by introducing minimum-volume covering sets equipped with learnable generalized multi-norm constraints. We propose a method that constructs conformal sets guaranteeing user-specified coverage of positive samples while maximizing negative sample exclusion. We establish theoretically that volume minimization serves as a proxy for negative exclusion, enabling our approach to operate effectively even when negative pairs are unavailable. The positive inclusion guarantee inherits the distribution-free coverage property of conformal prediction, while negative exclusion is maximized through learned set geometry optimized on a held-out training split. Experiments on simulated and real-world image datasets demonstrate improved inclusion-exclusion trade-offs compared to standard distance-based conformal baselines.
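
To make the coverage guarantee concrete, here is a minimal split-conformal sketch under a deliberate simplification: a plain Euclidean ball in embedding space stands in for the paper's learnable generalized multi-norm sets, and the function name and calibration setup are illustrative assumptions. The distribution-free positive-inclusion argument is the standard conformal one.

```python
import numpy as np

def conformal_radius(cal_embeddings, center, alpha=0.1):
    """Split-conformal radius: a Euclidean ball around `center` that
    covers new exchangeable positive samples with prob. >= 1 - alpha.

    cal_embeddings : (n, dim) embeddings of held-out positive samples.
    """
    scores = np.linalg.norm(cal_embeddings - center, axis=1)
    n = len(scores)
    # finite-sample corrected quantile level from conformal prediction
    level = min(np.ceil((n + 1) * (1 - alpha)) / n, 1.0)
    return np.quantile(scores, level, method="higher")

# usage sketch: calibrate on held-out positives, then a new embedding x
# is inside the conformal set iff np.linalg.norm(x - center) <= r
# rng = np.random.default_rng(0)
# cal = rng.normal(size=(500, 64)); center = cal.mean(axis=0)
# r = conformal_radius(cal, center, alpha=0.1)
```

In the paper's setting, the ball's fixed geometry would be replaced by a learned multi-norm set whose volume is minimized as a proxy for excluding negatives.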


Rethinking Semi-Supervised Medical Image Segmentation: A Variance-Reduction Perspective

Neural Information Processing Systems

For medical image segmentation, contrastive learning is the dominant practice for improving the quality of visual representations by contrasting semantically similar and dissimilar pairs of samples. It rests on the observation that, even without access to ground-truth labels, sampling negative examples with truly dissimilar anatomical features can significantly improve performance. In reality, however, sampled negatives may share similar anatomical features, and models may struggle to distinguish minority tail-class samples, leaving the tail classes more prone to misclassification; both failure modes typically lead to model collapse. In this paper, we propose $\texttt{ARCO}$, a semi-supervised contrastive learning (CL) framework with stratified group theory for medical image segmentation. In particular, we first propose building $\texttt{ARCO}$ on the concept of variance-reduced estimation, and show that certain variance-reduction techniques are particularly beneficial in pixel/voxel-level segmentation tasks with extremely limited labels. Furthermore, we theoretically prove that these sampling techniques are universal in variance reduction. Finally, we experimentally validate our approaches on eight benchmarks, i.e., five 2D/3D medical and three semantic segmentation datasets, under different label settings; our methods consistently outperform state-of-the-art semi-supervised methods. Additionally, we augment the CL frameworks with these sampling techniques and demonstrate significant gains over previous methods. We believe our work is an important step towards semi-supervised medical image segmentation by quantifying the limitations of current self-supervision objectives for accomplishing such challenging safety-critical tasks.
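
As a hedged sketch of the variance-reduction idea (not ARCO's actual sampler; the function and quota parameter are illustrative), stratified sampling draws a fixed quota of pixels from every class stratum instead of sampling pixels uniformly, which is the textbook device for reducing estimator variance and keeps tail classes represented:

```python
import numpy as np

def stratified_pixel_sample(labels, per_class=64, rng=None):
    """Sample an equal number of pixel indices from every (pseudo-)class.

    Uniform sampling over pixels under-represents tail classes; drawing
    a fixed quota per stratum reduces the variance of any per-class
    estimate built from the sample.

    labels : 1-D array of per-pixel class ids (e.g. flattened pseudo-labels)
    """
    rng = rng or np.random.default_rng()
    picks = []
    for c in np.unique(labels):
        idx = np.flatnonzero(labels == c)
        replace = len(idx) < per_class  # tiny tail classes are oversampled
        picks.append(rng.choice(idx, size=per_class, replace=replace))
    return np.concatenate(picks)
```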


Exploitation of a Latent Mechanism in Graph Contrastive Learning: Representation Scattering

He, Dongxiao

Neural Information Processing Systems

Graph Contrastive Learning (GCL) has emerged as a powerful approach for generating graph representations without the need for manual annotation. Most advanced GCL methods fall into three main frameworks: node discrimination, group discrimination, and bootstrapping schemes, all of which achieve comparable performance. However, the underlying mechanisms and factors that contribute to their effectiveness are not yet fully understood.
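
For orientation, the node-discrimination framework named above typically reduces to an InfoNCE objective between two augmented views of the same graph; the sketch below is a generic instance of that framework, not this paper's representation-scattering mechanism.

```python
import torch
import torch.nn.functional as F

def node_discrimination_loss(h1, h2, temperature=0.5):
    """Minimal node-discrimination objective over two graph views.

    h1, h2 : (num_nodes, dim) embeddings of the same nodes under two
             augmentations; node i in view 1 is positive with node i
             in view 2 and negative with every other node.
    """
    z1, z2 = F.normalize(h1, dim=-1), F.normalize(h2, dim=-1)
    logits = z1 @ z2.t() / temperature  # (num_nodes, num_nodes)
    targets = torch.arange(z1.size(0), device=z1.device)
    return 0.5 * (F.cross_entropy(logits, targets)
                  + F.cross_entropy(logits.t(), targets))
```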